Dali usage #1639

Merged
merged 5 commits into from
Apr 12, 2021

andrei5055
Contributor

@andrei5055 andrei5055 commented Mar 26, 2021

Script to run regular GluonCV training with and without the NVIDIA Data Loading Library (DALI).
To launch training, use scripts/classification/imagenet/test.sh, which also installs DALI.
Before launching this script, you can set the following environment variables:
MODEL, NUM_TRAINING_SAMPLES, NUM_EPOCHS, DATA_BACKEND, TRAIN_DATA_DIR
If some (or all) of these variables are not set, the script uses their default values:

  export MODEL=resnet18_v1
  export NUM_TRAINING_SAMPLES=1281167
  export NUM_EPOCHS=3
  export NUM_GPUS=0
  export DATA_BACKEND='mxnet'
  export TRAIN_DATA_DIR=~/.mxnet/datasets/imagenet

To launch this script with DALI, set the environment variable DATA_BACKEND to dali-gpu or dali-cpu.
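
The default-handling described above can be sketched in Python. This is only an illustration of the described behavior, not the contents of test.sh (which is a shell script): the helper name `resolve_settings`, the `DEFAULTS` table, and the backend check are assumptions made for this example, while the variable names and default values are taken from the PR description.

```python
import os

# Default values as listed in the PR description.
DEFAULTS = {
    "MODEL": "resnet18_v1",
    "NUM_TRAINING_SAMPLES": "1281167",
    "NUM_EPOCHS": "3",
    "NUM_GPUS": "0",
    "DATA_BACKEND": "mxnet",
    "TRAIN_DATA_DIR": "~/.mxnet/datasets/imagenet",
}

# Backend names mentioned in the PR description.
KNOWN_BACKENDS = ("mxnet", "dali-gpu", "dali-cpu")


def resolve_settings(env=None):
    """Hypothetical helper: use each variable from `env` if set, else its default."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}

    # Assumption: reject unknown backends early. The real script may not validate this.
    if settings["DATA_BACKEND"] not in KNOWN_BACKENDS:
        raise ValueError(
            "DATA_BACKEND must be one of %s, got %r"
            % (", ".join(KNOWN_BACKENDS), settings["DATA_BACKEND"])
        )

    # Expand "~" so the path can be handed to file APIs directly.
    settings["TRAIN_DATA_DIR"] = os.path.expanduser(settings["TRAIN_DATA_DIR"])
    return settings
```

For example, `resolve_settings({"DATA_BACKEND": "dali-gpu"})` selects the DALI GPU backend and leaves every other setting at its default.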

The following charts show the performance improvement when DALI is used:
image

and the accuracy achieved after 3 epochs:
image

@andrei5055
Contributor Author

@zhreshold: You can start reviewing this PR. The log files you asked about are here

@andrei5055 andrei5055 mentioned this pull request Mar 26, 2021
@github-actions

Job PR-1639-2023252 is done.
Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/PR-1639/2023252/index.html

@github-actions

Job PR-1639-dec5bc8 is done.
Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/PR-1639/dec5bc8/index.html

Member

@zhreshold zhreshold left a comment

Sorry for my delayed response. Can you take another look at the installation and error-catching issues?

@@ -13,18 +12,29 @@
from gluoncv.model_zoo import get_model
from gluoncv.utils import makedirs, LRSequential, LRScheduler

import dali
Member

Should we catch the ImportError here?

Contributor Author

Definitely, I can add handling for this exception. Actually, it could only happen when train_imagenet.py was updated but, for some reason, dali.py was not downloaded. I will do that.

Contributor Author

Done. I added the ImportError exception, which should occur when dali-gpu or dali-cpu is used but dali.py is not in scripts/classification/imagenet.
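
One way to get the behavior discussed in this thread is to import the DALI helper only when a DALI backend is selected, and to re-raise the failure with a message that names the missing file. The sketch below is an assumption about how that could look, not the code that was merged: the function `load_dali_helper` and its `module_name` parameter are hypothetical, and only the backend names and the dali.py location come from the comments above.

```python
import importlib

# Backends that need the local dali.py helper, per the discussion above.
DALI_BACKENDS = ("dali-gpu", "dali-cpu")


def load_dali_helper(data_backend, module_name="dali"):
    """Hypothetical helper: import the DALI helper module only when it is needed.

    Returns the imported module for a DALI backend, or None otherwise.
    """
    if data_backend not in DALI_BACKENDS:
        # The plain MXNet pipeline does not need dali.py, so skip the import.
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with a hint about the most likely cause named in the review:
        # train_imagenet.py was updated but dali.py was not downloaded with it.
        raise ImportError(
            "DATA_BACKEND=%r requires %s.py in scripts/classification/imagenet"
            % (data_backend, module_name)
        ) from err
```

With this shape, `load_dali_helper("mxnet")` returns `None` without touching DALI, so a missing dali.py only matters for the two DALI backends.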

scripts/classification/imagenet/train_imagenet.py (outdated, resolved)
@github-actions

github-actions bot commented Apr 8, 2021

Job PR-1639-a0c1ee2 is done.
Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/PR-1639/a0c1ee2/index.html

Collaborator

@yinweisu yinweisu left a comment

Tested all three backends. LGTM except for the import issue.

scripts/classification/imagenet/train_imagenet.py (outdated, resolved)
scripts/classification/imagenet/train_imagenet.py (outdated, resolved)
@github-actions

Job PR-1639-5dccefa is done.
Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/PR-1639/5dccefa/index.html

Collaborator

@yinweisu yinweisu left a comment

LGTM

@yinweisu yinweisu merged commit ab03ca0 into dmlc:master Apr 12, 2021
4 participants